Abstract

Background: Competition between spermatozoa from rival males for success in fertilization (i.e., sperm competition) 
is an important selective force driving the evolution of male reproductive traits and promoting positive selection in 
genes related to reproductive function. Positive selection has been identified in reproductive proteins showing rapid 
divergence at nucleotide level. Other mutations, such as insertions and deletions (indels), also occur in protein-coding 
sequences. These structural changes, which exist in reproductive genes and result in length variation in coded proteins, 
could also be subjected to positive selection and be under the influence of sperm competition. Catsper1 is one such
reproductive gene coding for a germ-line specific voltage-gated calcium channel essential for sperm motility and
fertilization. Positive selection appears to promote fixation of indels in the N-terminal region of CatSper1 in mammalian
species. However, it is not known which selective forces underlie these changes and their implications for sperm 
function. 

Results: We tested if length variation in the N-terminal region of CatSper1 is influenced by sperm competition
intensity in a group of closely related rodent species of the subfamily Murinae. Our results revealed a negative
correlation between sequence length of CatSper1 and relative testes mass, a very good proxy of sperm competition
levels. Since CatSper1 is important for sperm flagellar motility, we examined if length variation in the N-terminus of
CatSper1 is linked to changes in sperm swimming velocity. We found a negative correlation between CatSper1
length and several sperm velocity parameters.

Conclusions: Altogether, our results suggest that sperm competition selects for a shortening of the intracellular region 
of CatSper1 which, in turn, enhances sperm swimming velocity, an essential and adaptive trait for fertilization success.

Background

Many genes involved in sexual reproduction evolve rap- 
idly, often as a result of adaptive evolution or positive se- 
lection [1]. Several selective forces have been proposed 
to drive the adaptive evolution of reproductive genes, 
including sperm competition, female cryptic choice, or 
sexual conflict [2-4]. Sperm competition arises when 
multiple males copulate with a female in a polyandrous 
system. As a consequence, the ejaculate of each male 
competes with those of other males to be the first to 
fertilize the egg(s) [4,5]. Sperm competition generates
postcopulatory sexual selection influencing reproductive
traits to increase the success of an ejaculate at fertilizing 
under competitive conditions [4,6]. 

A critical determinant of the outcome of sperm com- 
petition is the relative number of spermatozoa provided 
by different males. Males respond to sperm competition 
by increasing sperm numbers, which is achieved by an 
increase in testes mass relative to body mass [4]. Relative
testes mass is associated with levels of sperm competition in
many taxa [4,7-9] and, thus, is widely used as a reliable
index of levels of sperm competition. Sperm swimming
velocity is also a main determinant of fertilization success
[10-13]. An increase in the size of sperm components
is generally associated with increases in sperm swimming
velocity [14-17].

Rapid divergence of reproductive proteins as a result
of adaptive evolution may be linked to adaptive changes 
in reproductive traits under competitive conditions [1]. 
However, despite the potentially pervasive influence of 
sexual selection in driving adaptation at the molecular 
level, only a small set of studies has so far found an asso- 
ciation between levels of sperm competition and rates of 
molecular evolution in reproductive genes [18-20]. In 
most studies searching for adaptively evolving proteins, 
the signature of positive selection has been tested by
estimating the ratio of nonsynonymous (amino
acid-replacing) to synonymous (silent) nucleotide
substitutions (dN/dS). Other forms of genetic variation different
from nucleotide substitutions have received less attention 
in studies of adaptive evolution. One of these alternatives, 
i.e., structural variation, is widespread in animal genomes 
[21,22] and structural variants may shape the variation in 
phenotypic traits between individuals [23]. Therefore, 
positive selection might also act by fixing structural variants
in gene coding sequences which may be advantageous
for certain physiological traits. Insertions and deletions
(indels) have been shown to be subjected to positive
selection in Drosophila accessory gland proteins [24]. Indel 
substitutions are the most abundant structural variants in 
the mammalian genome [21] and previous studies have re- 
ported that some mammalian reproductive proteins show 
an elevated fixation of indels [25-27], suggesting that posi- 
tive selection may play a role in the evolutionary change of 
protein length. 

CatSper is a sperm-specific Ca2+ ion channel that is
exclusively found in the plasma membrane of the principal
piece of the mammalian sperm flagellum, and presumably
forms a heterotetrameric, pH- and voltage-dependent
Ca2+-permeable channel [28]. The CatSper channel allows
Ca2+ influx into the sperm flagellum, which is important
for sperm motility at different stages in the life of sperm- 
atozoa during their transit along the female tract to the 
site of fertilization. Differences may exist between species 
with regards to the timing and roles of CatSper activa- 
tion, or in the mechanisms underlying motility regulation 
[29-32]. Thus, some evidence suggests that CatSper may 
be related to motion through viscous media such as that 
observed in the cervix or, perhaps, the utero-tubal junc- 
tion, but not in later events during transit in the oviduct 
[31], and that CatSper channels are key elements of 
major Ca2+ entry pathways during basal (the so-called
activated) motility [30]. On the other hand, targeted 
disruption of CatSper channel subunits results in a lack 
of a more vigorous (so-called hyperactivated) motility in 
mouse spermatozoa after incubation in conditions that 
prepare them for fertilization (i.e., capacitation) [29]. Mu- 
tant male mice have spermatozoa with reduced sperm 
velocity parameters after incubations and they are infertile
[29]. This has led to the conclusion that CatSper is
essential for sperm hyperactivation, which takes place in
the oviduct, and is necessary for ova penetration during
fertilization [28]. It is possible that CatSper activation can
elicit functionally different behaviors according to extracellular
Ca2+ concentrations and to the sensitivity of the
sperm Ca2+ stores [31]. Altogether, because CatSper is
important for sperm motility, and is required for male 
fertility, this channel is a promising candidate for a key 
involvement in sperm competition. 

A high number of indel substitutions have been favored
by positive selection in the first exon of the Catsper1
gene, which codes for the intracellular N-terminus of
the CatSper channel [25,26]. Nevertheless, the selective
forces underlying this high number of indels have not been
identified. Although a clear function for this CatSper1
N-terminus has not yet emerged, it is possible that
the length of this region might affect the regulation of
the CatSper channel. In such a case, the structural variation
of the N-terminus of CatSper1 could affect sperm flagellar
motility and, as a consequence, influence sperm swimming
velocity, which is a major determinant of reproductive
success in sperm competition.

In this study, we examined whether an elevated rate of
indels in the N-terminus of the CatSper1 sequence has
an adaptive value in terms of sperm competition in
rodents. First, we tested whether the indel-related length
variation of the N-terminal region of CatSper1 is
associated with different levels of sperm competition. Second,
given the vital role of the CatSper channel in sperm
movement, and that sperm swimming speed is an
important trait for fertilization, we assessed if changes in
length of the CatSper1 N-terminus are linked to phenotypic
changes in sperm velocity parameters. Third, we
analyzed whether sperm competition may promote episodes
of positive selection at the nucleotide level. Fourth, since
the amino terminus of CatSper1 may be involved in pH
regulation of CatSper channel activity, due to its
remarkably high content of histidine residues, we investigated
whether the amount of histidines produced both by
structural and molecular changes is associated with
sperm competition and phenotypic adaptations. To this
end, we sequenced the first exon of the Catsper1 gene
in several species belonging to the subfamily Murinae,
assessed whether nucleotide and structural variations
within this region may be driven by sperm competition,
and examined possible associations between sequence
length and sperm swimming velocity.

Methods 

Species 

Our study included a total of 16 rodent species belonging
to the subfamily Murinae and comprising five genera: Mus
spretus, Mus spicilegus, Mus macedonicus, Mus famulus,
Mus caroli, Mus cookii, Mus pahari, Mus m. musculus,
Mus m. domesticus, Mus m. castaneus, Mus m. bactrianus,
Mus minutoides, Mastomys natalensis, Apodemus
sylvaticus, Lemniscomys barbarus and Rattus norvegicus. This
group of species covers a wide range of levels of sperm
competition. Males of Mus species were purchased from
the Institut des Sciences de l'Evolution, CNRS-Université
Montpellier 2, France. Apodemus sylvaticus males were
caught in the wild during the breeding season (Permit
number 8688/02 from Consejería de Medio Ambiente,
Comunidad de Madrid). Males of Mastomys natalensis
and Lemniscomys barbarus came from wild-derived
colonies which have been kept in captivity for only a
few generations. Animal handling and housing followed
the standards of the Spanish Animal Protection
Regulation RD 1201/2005, which conforms to European Union
Regulation 2003/65. Animals were used complying with
the Convention on Biological Diversity and the
Convention on the Trade in Endangered Species of Wild Fauna
and Flora. This study was approved by the Bioethics
Committee of the Consejo Superior de Investigaciones
Científicas (CSIC, Spain).

Catsper1 sequences

The first exon of the Catsper1 gene was amplified by
polymerase chain reaction (PCR). PCR primers were
designed based on the Catsper1 sequences published in
the NCBI GenBank (http://www.ncbi.nlm.nih.gov) for
Mus musculus and Rattus norvegicus (accession
numbers NM_139301 and XM_001070492, respectively) as
well as the genomic data for multiple mouse strains
available from the Sanger Institute database
(http://www.sanger.ac.uk/). To design reverse primers, we also
used Catsper1 coding sequences reported previously [26]
for Mus species (GenBank accession numbers
DQ021482-DQ021500). These sequences were not employed to design
forward primers because they started downstream of the
start codon. PCR mixtures were prepared in a 50 µl
volume containing PCR Gold buffer 1x (Roche, Barcelona,
Spain), 2.5 mM MgCl2 (Roche), 0.8 mM dNTPs mix
supplying 0.2 mM of each deoxynucleotide triphosphate
(Applied-Biosystems, Barcelona, Spain), 0.3 mM of
forward and reverse primers (Life Technologies, Madrid),
2 U of DNA polymerase (Biotools, Madrid), and
20-200 ng/µl of genomic DNA template. All PCRs were
performed in a Veriti thermocycler (Applied-Biosystems).
The conditions of the thermocycler program consisted of
35-45 cycles with an initial denaturation of 95°C for
30-40 s, an annealing stage at 58-62°C (depending on
folding temperature of primers) for 60 s, and an elongation
stage at 72°C for 80 s. PCR products were purified by using
the E.Z.N.A.® Cycle Pure kit (Omega). Purified products
were usually sequenced directly (Secugen S.L., Madrid,
Spain). Products with problematic sequencing were cloned
using pGEM®-T Vector System (Promega, Madrid, Spain)
following the protocol provided by the manufacturer. The
first exon of Catsper1 was sequenced for at least 3
individuals per species in order to generate a consensus sequence.
Catsper1 sequences reported earlier [33] were not used to
avoid polymorphisms due to different source populations.

Alignments and trees 

Processing and correction of sequences were performed 
using the sequence viewer and alignment editor BioEdit. 
Sequences from several individuals belonging to the same 
species were used to generate a consensus sequence per 
species. Consensus sequences were bound to the cod- 
ing sequence using as reference the sequence of Mus 
musculus retrieved from NCBI GenBank (accession num- 
ber NM_139301). Nucleotide sequences were aligned 
using the algorithm ClustalW implemented in BioEdit. To 
test the robustness of the alignment, we performed repeated
ClustalW runs varying penalty parameters for gap opening
and gap extension. Those regions in which indels led to
inaccurate alignments were manually edited. Nucleotide
coding sequences were translated to amino acid sequences 
and the correct frame was checked using the protein se- 
quence of Mus musculus, retrieved from NCBI GenBank 
(accession number NP_647462), as well as those translated 
sequences from Podlaha et al. [33]. Amino acid sequences 
were aligned through ClustalW. 

CatSper1 phylogenies were reconstructed using
Neighbor-Joining (MEGA 5.03) and Maximum Likelihood (PhyML)
methods. Statistical selection of the best-fit model of
nucleotide substitution was performed by jModelTest
software [33]. For evolutionary analyses we used an
input tree comprising our range of species on the basis
of well resolved phylogenies for rodents (see Results
section) [34-37].

Tests for positive selection 

We used the nonsynonymous/synonymous substitutions
ratio (ω = dN/dS) as an indicator of selective pressure
at the protein level, with ω = 1 indicating neutral
evolution, ω < 1 purifying selection, and ω > 1 diversifying
positive selection. To estimate rates of sequence evolution
we used the application Codeml implemented in
the PAML 4 package. In order to detect variable selective
pressures in the first exon of Catsper1 and infer residues
under positive selection we applied models that
account for heterogeneous ω ratios among amino acid
sites [38]. Using likelihood ratio tests, we compared a null
model that does not allow sites with ω > 1 with a selection
model that does. We used two kinds of
likelihood ratio tests. The first compared a nearly neutral
model M1a, which assumes values for ω between 0
and 1, with a model M2a which allows values of ω > 1.
The second test is more refined and compares two
models assuming a beta distribution for ω values. In this
case, the null model M7, which limits ω between 0 and 1, is
compared to the alternative model M8, which adds an extra
class of sites with an ω ratio estimated to be greater
than 1. We also used a third test comparing the M8
model with the null hypothesis M8a, which fixes ω
to 1 instead of estimating an additional class of sites,
reducing false positives [39]. These tests compare twice
the difference in log-likelihood between the alternative and
the null model to critical values from a chi-square distribution
with the degrees of freedom equal to the difference in
the number of parameters between the two models. If
the alternative models showed a significantly better fit
in the likelihood ratio test, Bayes empirical Bayes (BEB)
analysis [40] was used to infer positively selected sites with
posterior probabilities higher than 0.95 under both model
M2a and M8.
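
As an illustration of the likelihood ratio test described above, the following minimal sketch computes the test statistic and its chi-square P value from two log-likelihoods. The function names and the numerical log-likelihoods are hypothetical and are not taken from this study. The sketch only handles 1 or 2 degrees of freedom, which are the values commonly used for the M8a vs. M8 comparison and for the M1a vs. M2a and M7 vs. M8 comparisons, respectively.

```python
import math

def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if stat <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")

def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, P value) for a pair of nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)

# Hypothetical log-likelihoods for a null and an alternative site model.
stat, p = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-990.0, df=2)
print(f"2*dlnL = {stat:.2f}, P = {p:.2e}")
```

The closed-form tail probabilities keep the example free of external dependencies; a general implementation would call a statistics library instead.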

To adjust codon frequency and number of gamma
categories, we ran repeated analyses varying the values of
these parameters and we used the setting with the best
fit according to the likelihood values of models.

Estimation of lineage-specific evolutionary rates 

Site-analysis estimates variable ω ratios among sites and
identifies residues under putative positive selection.
Nonetheless, these models assume that the evolutionary
parameters are invariant across the lineages of the
phylogeny. We therefore calculated the evolutionary rates for
each branch of the phylogeny using the Codeml free
branch model [41]. ω values were estimated for each
lineage by adding dN and dS values from the root to
the respective terminal branch and calculating the
ratio of the sums. By calculating ω ratios from the root
of the tree we considered the total accumulated
selective pressures in Catsper1 during its evolution,
which is more suitable for testing relationships against
phenotypic data which do reflect the whole phenotypic
evolution from the common ancestor [42]. In addition,
estimating evolutionary data since the last common
ancestor forces all branches to have the same length and
therefore the analysis is not subject to temporal effects
on dN/dS.
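
The root-to-tip calculation can be sketched as follows. This is a minimal illustration under stated assumptions: the tree is given as a child-to-parent mapping, per-branch dN and dS values are keyed by the node at the lower end of each branch, and the three-taxon tree and its values are hypothetical rather than Codeml output from this study.

```python
def root_to_tip_omega(parent, dn, ds, tip):
    """Sum dN and dS over all branches from the root to `tip`, then take the ratio."""
    total_dn = 0.0
    total_ds = 0.0
    node = tip
    while node in parent:  # the root has no entry in `parent`
        total_dn += dn[node]
        total_ds += ds[node]
        node = parent[node]
    return total_dn / total_ds

# Hypothetical tree: root -> (A, X), X -> (B, C)
parent = {"A": "root", "X": "root", "B": "X", "C": "X"}
dn = {"A": 0.030, "X": 0.010, "B": 0.020, "C": 0.005}
ds = {"A": 0.020, "X": 0.010, "B": 0.010, "C": 0.010}

for tip in ("A", "B", "C"):
    print(tip, round(root_to_tip_omega(parent, dn, ds, tip), 3))
```

Note that the ratio is taken after summing, so it is not the mean of the per-branch ω values, and a path with a summed dS of zero would need separate handling.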

Analysis of indel substitutions 

Indels produced in the first exon of Catsper1 were coded
in the alignment by SeqState 1.41 software using a
modified complex coding scheme. Events of indel
substitutions were then inferred using the parsimony principle
and considering the phylogenetic position of the species.
This implies that, in cases where multiple equally
parsimonious solutions for an indel were found, the first
indel was assumed to happen in the split between the
common ancestor of a clade with preponderance of an
indel and the closest species carrying the indel variant.
If the same indel variant was observed in any other
species within the clade, such indel was considered
homoplasious by evolutionary convergence. The species
Cricetulus griseus was used as outgroup to infer indels
occurring between Rattus norvegicus and remaining
lineages. Indels were identified either as deletions or
insertions and they were mapped onto the species tree (see
Results section).

Length of the N-terminus of CatSper1 was calculated
for each species as the total number of amino acids after
checking that no indel is produced in the flanking
regions and thus assuming that variations in sequence
length are exclusively driven by internal indels.

Analysis of histidines 

Total number and proportion of histidines were calcu- 
lated for each species to assess whether molecular and 
structural changes are promoting variations in the amount 
of this residue which is presumably important for the regu- 
lation of the CatSper channel. 
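
The length and histidine measures described in the two preceding sections can be sketched for a single aligned protein sequence as follows. The function name and the toy sequence are hypothetical; gaps are assumed to be coded as dashes, as in the alignment figure.

```python
def n_terminus_stats(aligned_protein):
    """Return (length, histidine count, histidine proportion) ignoring gaps."""
    residues = aligned_protein.replace("-", "").upper()
    length = len(residues)
    n_his = residues.count("H")
    return length, n_his, n_his / length

# Toy aligned fragment, not a real CatSper1 sequence.
length, n_his, prop_his = n_terminus_stats("MHH--PSHQ-H")
print(length, n_his, prop_his)
```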

Sperm competition and sperm velocity parameters 

Relative testes mass has been widely used as a reliable 
indicator of sperm competition levels in analyses of the 
evolution of ejaculate traits [9] and reproductive genes 
[18,20,43]. To obtain relative testes mass, males (N = 5 
for each species) were sacrificed by cervical dislocation 
and weighed. After removal, the testes were weighed and 
measured. Values of relative testes mass for Rattus nor- 
vegicus were taken from the literature [44]. Mean relative 
testes mass values were calculated using the regression 
equation for rodents [44] (Additional file 1). 

Sperm velocity parameters were measured using a
computer-assisted sperm analyzer (Sperm Class Analyzer
v.4.0, Microptic, Barcelona, Spain). A total of 5 µl of
sperm suspensions was placed in a 20-µm deep slide
chamber (Standard Count-2 Chamber Slide 20-micron,
Leja, Nieuw-Vennep, Netherlands) pre-warmed to 37°C,
and examined using phase contrast microscopy with
a 4x objective. Data on sperm velocity parameters
were obtained within 5 min of sample collection for all
individuals.

Using a video camera (Basler A312fc, Vision
Technologies), up to eight videos of 4 s each were recorded
for each male's sperm sample. Sperm concentration
was previously adjusted to 4-6 x 10^ sperm/ml to
satisfy the requirements of the analysis. Videos were
analyzed and a minimum of 150 tracks were obtained for
each male's sample, with N = 5 males analyzed for each
species.

Seven sperm velocity parameters were quantified:
curvilinear velocity (VCL) (in µm/s), straight line
velocity (VSL) (in µm/s), average path velocity (VAP)
(in µm/s), linearity (LIN) (in %), straightness (STR) (in %),
amplitude of lateral head displacement (ALH) (in µm/s)
and beat cross frequency (BCF) (in Hz). To reduce
potentially correlated components of velocity to a single
factor summarizing the information, we performed a
principal component analysis (PCA). The seven velocity
descriptors were used as variables in the PCA, rendering
two principal components that accounted for 70% (PC1)
and 22% (PC2) of total variability. The three principal
sperm velocity parameters (VCL, VSL, and VAP) showed
a significant positive correlation with PC1 and no
correlation with PC2, and thus PC1 was interpreted as the
global measure of sperm velocity (hereafter referred to as
"overall sperm velocity").

Statistical analyses 

Possible relationships between the evolution of the
N-terminal region of CatSper1 and phenotypic and
ecological adaptations were evaluated through linear
regression analyses. To test whether evolutionary
dynamics of the N-terminus of CatSper1 are associated with
sperm competition, linear regression analyses were
performed using ω calculated from the root of the
phylogeny and sequence length as dependent variables and
relative testes mass as predictor variable. The same
dependent variables were used in multiple regression
analyses with body mass and testes mass as predictor
variables. In this case, since predictor variables are
related to each other, they were added to the multiple
regression analysis in order, first body mass, and then
testes mass, using a sequential (Type I) sum of squares.
In order to search for relationships between CatSper1
evolution and sperm velocity, different descriptors of
sperm swimming velocity (see above) were used as
dependent variables in simple regression analysis with
ω and CatSper1 sequence length as predictor variables.
The level of significance was set at P < 0.05 for all
tests. Mus m. bactrianus, Mus famulus and Mus cookii
were not included in these analyses because of lack of
data for these species.
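
The simple regressions described above can be illustrated with an ordinary least-squares sketch. This is deliberately not the phylogenetically corrected analysis used in the study, since it treats species as independent data points. The function name and the data values are hypothetical and serve only to show the direction of the variables (sequence length as dependent variable, relative testes mass as predictor).

```python
import math

def simple_regression(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical values: relative testes mass (predictor) and
# N-terminus length in amino acids (dependent variable).
relative_testes_mass = [0.5, 1.0, 1.5, 2.0]
sequence_length = [310, 305, 300, 295]

slope, intercept, r = simple_regression(relative_testes_mass, sequence_length)
print(slope, intercept, r)
```

A negative slope in such a fit corresponds to shorter sequences at higher relative testes mass; significance testing and phylogenetic correction are outside the scope of this sketch.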

Since species may share character values as a result of 
a common ancestry rather than independent evolution, 
regression analyses were performed using a phylogenetic 
generalized least-squares (PGLS) approach [45]. This 
powerful method allows for a control of phylogenetic ef- 
fects on the associations between variables. PGLS ana- 
lyses were conducted using the CAPER package for the 
statistical environment R v.2.10.1 (R Development Core 
Team, 2011). Phylogenetic effects were controlled based 
on the tree topology and branch lengths were calculated 
under the M0 model included in PAML.

Results 

Catsper1 sequences

Nucleotide sequences of the first exon of the Catsper1
gene were obtained for 16 murid species and aligned
according to their coding sequences. Multiple ClustalW
alignment spanned 1080 nucleotides and revealed a high
sequence divergence as well as an elevated number of
gaps in Catsper1 (see Additional file 2). Sequences
among species varied in length from 879 to 957
nucleotides. Amino acid alignment of the N-terminal region of
CatSper1 is shown in Figure 1.

Catsper1 phylogenies built by Neighbor-Joining and
Maximum Likelihood approaches (Additional file 3)
showed almost identical topology, but they showed some
differences when compared to the species tree (Figure 2),
which suggests that CatSper1 may be subjected to
selective forces that would alter the evolutionary pattern
expected from the phylogenetic relationships among the
species. A total of 58 parsimony-inferred indel
substitutions in the first exon of Catsper1 were mapped onto
the species tree (Figure 2). A total of 50 of these indels
were unique and congruent with the tree topology and 8
were homoplasious (Figure 2). No indel polymorphism
was found between individuals, so we assume that
all consensus sequences are representative of each
species. A total of 35 indels fell in terminal branches and
ranged from none for Mus cookii, Mus spretus, Mus
spicilegus, Mus macedonicus, Mus m. domesticus and Mus m.
castaneus to 9 for Mastomys natalensis (Figure 2). A
total of 34 deletions were detected throughout the
phylogeny against 24 insertions.

All indels identified in the alignment spanned lengths 
of 3n nucleotides (see Figure 2 and Additional file 2), 
and thus the reading frame remained intact in all
sequences. The number of observed indels in the first
exon of Catsper1 was significantly higher than both the
genomic average and Catsper1 locus neutral indel
substitution rates (see [33] for methodology).
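
The observation that all indels span multiples of three nucleotides can be checked with a short sketch such as the one below. It is a simplification under stated assumptions: gaps are coded as dashes, each run of consecutive gaps in one aligned sequence is treated as a single event, and the sequences shown are toy examples. It does not reproduce the SeqState coding scheme or the parsimony inference used in the study.

```python
import re

def gap_runs(aligned_seq):
    """Lengths of each run of consecutive gap characters in an aligned sequence."""
    return [len(match.group()) for match in re.finditer(r"-+", aligned_seq)]

def preserves_frame(aligned_seq):
    """True if every gap run has a length that is a multiple of 3."""
    return all(run % 3 == 0 for run in gap_runs(aligned_seq))

# Toy aligned nucleotide sequences, not real Catsper1 data.
in_frame = "ATGCAT---CACCAT------TAA"
frameshift = "ATG--CATTAA"

print(gap_runs(in_frame), preserves_frame(in_frame))
print(gap_runs(frameshift), preserves_frame(frameshift))
```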

Tests for positive selection 

A high number of amino acid replacements was
observed in the CatSper1 N-terminus in addition to indels
(Figure 1), suggesting that positive selection could also
be promoting molecular variation in Catsper1. Robust
evidence of positive selection was detected in the first
exon of Catsper1 when likelihood values of M2a and M8
selection models were compared with the corresponding
values of M1a, M7 and M8a neutral models (Table 1). In
both cases, likelihood ratio tests rejected the neutral
models. A total of 10 and 9 significant positively selected
sites were identified with a Bayesian posterior probability
of Pb > 0.95 under M2a and M8 models respectively
(Table 1, Figure 1). These analyses reveal that the
N-terminal region of CatSper1 is subjected to strong
positive selection also at the molecular level.

We found that 4 out of 10 (40%) positively selected
sites fell in positions containing histidines. Considering
the possible role of histidine residues as a pH-sensor in
CatSper1 activation, adaptive mutations on this domain
may have important functional implications.

Figure 1 Amino acid alignment of N-terminal region of CatSper1. Translated sequences of 16 rodent species analyzed in this study. Dashes
represent alignment gaps. Arrows on positions represent sites under putative positive selection with a Bayesian posterior probability >0.95 under
M2a and M8 models.

CatSperl evolution and sperm competition 

Lineage-specific evolutionary rates (ω) estimated under the
PAML free branch model for the first exon of Catsper1
were greater than 1 in all cases except for Lemniscomys
barbarus (Additional file 1), thus revealing that intense
positive selection is acting on CatSper among these rodent
species. To seek evidence for an influence of sperm
competition on the evolutionary rate of CatSper1, we
correlated lineage-specific ω ratios with their respective
relative testes mass values. Phylogenetic generalized
least-squares (PGLS) analysis correcting for phylogenetic effects
showed no significant associations of lineage-specific ω
with relative testes mass or testes mass corrected for body
mass (Table 2).
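
The regression step behind a PGLS analysis can be sketched as a generalized least-squares fit in which the error covariance comes from the phylogeny. The block below is a minimal illustration under stated assumptions, not the software or settings used in this study: the function names (`solve`, `gls_fit`) and the toy covariance matrix are hypothetical, the covariance is taken as given rather than derived from a tree, and no λ parameter is estimated.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def gls_fit(x, y, cov):
    """Return (intercept, slope) of y ~ x under error covariance `cov`.

    Implements beta = (X' V^-1 X)^-1 X' V^-1 y for a design matrix X
    made of a column of ones and the predictor x.
    """
    ones = [1.0] * len(x)
    vi_ones = solve(cov, ones)  # V^-1 applied to the intercept column
    vi_x = solve(cov, x)        # V^-1 applied to the predictor column
    vi_y = solve(cov, y)        # V^-1 applied to the response
    normal = [[dot(ones, vi_ones), dot(ones, vi_x)],
              [dot(x, vi_ones), dot(x, vi_x)]]
    rhs = [dot(ones, vi_y), dot(x, vi_y)]
    intercept, slope = solve(normal, rhs)
    return intercept, slope


# Hypothetical covariance for four species forming two sister pairs.
TOY_COV = [[1.0, 0.5, 0.0, 0.0],
           [0.5, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.5],
           [0.0, 0.0, 0.5, 1.0]]
```

With an identity covariance matrix the fit reduces to ordinary least squares; the off-diagonal terms are what down-weight information shared by closely related species.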

To test whether structural variation of the CatSper1
N-terminal region is influenced by sperm competition,
correlations between length of this region and relative testes
mass were examined. Because the PGLS test is a least
squares-based regression analysis, and it is highly sensitive
to violations of assumptions and outliers, we first
searched for the presence of outliers in our dataset. We
found that sequence length of Mastomys natalensis was
very different from that of the other species (Additional
file 4). This was mainly due to its specific 51-bp long
insertion produced as the result of a recent tandem
duplication containing 3 codons (see Figure 1, Additional
file 2). Therefore, we have not included this lineage in
the analyses. A significant negative correlation was observed
between the length of CatSper1 N-terminus and
relative testes mass (Table 2). Species with high values
of relative testes mass (i.e., with higher levels of sperm
competition) presented a shorter N-terminus of CatSper1
whilst those with lower values of relative testes mass (low
intensity of sperm competition) showed longer fragments
(Figure 3).

[Tree tip labels, in the order shown: Rattus norvegicus, Lemniscomys barbarus, Apodemus sylvaticus, Mastomys natalensis, Mus minutoides, Mus pahari, Mus cookii, Mus caroli, Mus famulus, Mus spretus, Mus spicilegus, Mus macedonicus, Mus m. bactrianus, Mus m. domesticus, Mus m. castaneus, Mus m. musculus.]

Figure 2 Phylogeny of the 16 rodent species analyzed in this study. Parsimony-informative indels are mapped throughout the tree. Filled
rectangles indicate insertions and open rectangles indicate deletions. Indels are labeled according to region within the CatSper1 sequence
where indels were identified (see Additional file 2: Figure S1) followed by number of inserted/deleted nucleotides. Underlined labels indicate
homoplasious indels.



Table 1 Tests of positive selection in CatSper1

| N | Ls | Best fit model | Log-likelihood values^a | Parameter estimates | PSS^b |
|---|---|---|---|---|---|
| 16 | 235 | M1a | -2907.9808 | p0 = 0.269, p1 = 0.731, ω0 = 0.027, ω1 = 1 | Not allowed |
| | | M2a | -2879.567332 | p0 = 0.186, p1 = 0.651, p2 = 0.163, ω0 = 0, ω1 = 1, ω2 = 4.67 | 124H*, 128H*, 146S*, 147T**, 154H**, 165S*, 179L, 198T**, 201A**, 204Q** |
| | | M7 | -2908.013308 | p = 0.041, q = 0.012 | Not allowed |
| | | M8 | -2879.676624 | p0 = 0.833, p = 0.0166, q = 0.0052, p1 = 0.166, ω = 4.535 | 124H*, 146S*, 147T**, 154H**, 165S*, 179L*, 198T**, 201A**, 204Q** |
| | | M8a | -2907.982933 | p0 = 0.269, p = 2.758, q = 99.0, p1 = 0.731, ω = 1.0 | Not allowed |

^a Differences between log-likelihood values of models with 99% statistical significance level for 2 degrees of freedom are indicated.
^b Positively selected sites (PSS) with a posterior probability >0.95 (*) and >0.99 (**) in Bayes Empirical Bayes.
N, Number of sequences aligned.
Ls, Sequence length (in codons) after alignment gaps are removed.
Parameter estimation and likelihood scores under site models.
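
As a worked check on the log-likelihood values in Table 1, the nested site-model comparisons can be sketched as likelihood ratio tests. The block assumes the usual pairings (M1a against M2a, M7 against M8) with 2 degrees of freedom, in line with the table footnote; the function name `lrt_df2` is hypothetical. For 2 degrees of freedom the chi-square tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

# Log-likelihood values copied from Table 1.
LOG_LIK = {
    "M1a": -2907.9808,
    "M2a": -2879.567332,
    "M7": -2908.013308,
    "M8": -2879.676624,
}


def lrt_df2(lnl_null, lnl_alt):
    """Return (2 * delta lnL, P) for a likelihood ratio test with 2 d.f."""
    stat = 2.0 * (lnl_alt - lnl_null)
    # Survival function of a chi-square distribution with 2 d.f.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


for null_model, alt_model in (("M1a", "M2a"), ("M7", "M8")):
    stat, p_value = lrt_df2(LOG_LIK[null_model], LOG_LIK[alt_model])
    print(f"{null_model} vs {alt_model}: 2*dlnL = {stat:.3f}, P = {p_value:.2e}")
```

Both statistics are far above 9.21, the chi-square critical value for 2 degrees of freedom at the 1% level, so the models allowing a class of sites with ω > 1 fit significantly better.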
Table 2 PGLS analyses of evolutionary rate and sequence length of CatSper1 in relation to relative testis mass (RTM)

| Dependent variable | Predictor | n | d.f. | Slope | R² | F | P | λ^a | P^b (λ = 0) | P^b (λ = 1) | ES^c | CL (-)^d | CL (+)^d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Omega | RTM | 16 | 14 | -0.049 | 0.012 | 0.139 | 0.855 | 0.999 | <0.001 | 1 | 0.106 | -0.438 | 0.649 |
| Omega | Body mass | 16 | 14 | 0.0022 | 0.539 | 13.939 | 0.0028 | 0.999 | <0.001 | 1 | 0.052 | -0.363 | 1.449 |
| | Testes mass | | | -0.0924 | | 0.14 | 0.7144 | | | | 0.056 | -0.439 | 0.647 |
| Sequence length | RTM | 15 | 13 | -7.214 | 0.53 | 14.682 | 0.002** | 0.402 | 0.257 | 0.078 | 0.925 | 0.359 | 1.491 |
| Sequence length | Body mass | 15 | 12 | 0.149 | 0.778 | 8.895 | 0.011* | 0 | 1 | 0.007 | 0.779 | 0.214 | 1.345 |
| | Testes mass | | | -21.159 | | 33.055 | 0.0009** | | | | 1.28 | 0.714 | 1.846 |

P values < 0.05 (*) and < 0.01 (**) indicate statistical significance.
^a λ value indicates phylogenetic effect.
^b Significance of log-likelihood ratios for λ against models with λ = 0 and λ = 1 are shown. Statistically significant values are shown in bold.
^c Effect size (ES) calculated from the F-values.
^d Noncentral 95% confidence limits (CL). Confidence intervals excluding 0 indicate statistically significant relationships and are shown in bold.
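
The table footnotes state that effect sizes were calculated from the F-values but do not give the formula. One reconstruction, offered here only as an assumption, converts F to a correlation coefficient, applies Fisher's z transformation, and builds 95% limits from the standard error 1/sqrt(n - 3). The function name `effect_size_from_f` is hypothetical. This sketch reproduces the sequence length versus RTM row of Table 2 to three decimals, but it does not match every row exactly, so the original calculation may have differed in detail.

```python
import math


def effect_size_from_f(f_value, df_error, n):
    """Return (ES, lower CL, upper CL) from an F statistic with 1 numerator d.f.

    Assumed steps: r = sqrt(F / (F + d.f.)), ES = atanh(r) (Fisher's z),
    and 95% limits ES +/- 1.96 / sqrt(n - 3).
    """
    r = math.sqrt(f_value / (f_value + df_error))
    z = math.atanh(r)
    half_width = 1.96 / math.sqrt(n - 3)
    return z, z - half_width, z + half_width


# Sequence length vs RTM row of Table 2: F = 14.682, d.f. = 13, n = 15.
es, lower, upper = effect_size_from_f(14.682, 13, 15)
print(f"ES = {es:.3f}, CL = ({lower:.3f}, {upper:.3f})")
```

An interval that excludes 0, as in this row, corresponds to the relationships the table flags as statistically significant.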



CatSper1 evolution and sperm swimming velocity

Given the important role of CatSper in sperm motility,
we evaluated whether molecular and structural divergence
of its N-intracellular domain has effects on sperm
movement properties. No significant relationships were
obtained when evolutionary rates of Catsper1 were
correlated with parameters of sperm velocity (data not
shown). On the other hand, significant negative correlations
were obtained when CatSper1 N-terminus length
was analyzed with parameters measuring linear velocity
(VSL, VAP, LIN, STR) and an overall sperm velocity
component estimated from PCA (Table 3, Figure 4).
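
Two of the parameters named above are ratios of the measured velocities rather than independent measurements. The sketch below uses the conventional definitions from computer-assisted sperm analysis, linearity as VSL/VCL and straightness as VSL/VAP; this section does not spell the formulas out, so the definitions, the function names, and the choice to return plain ratios (many systems report them multiplied by 100) are assumptions.

```python
def linearity(vsl, vcl):
    """LIN: straight-line velocity (VSL) relative to curvilinear velocity (VCL)."""
    if vcl <= 0:
        raise ValueError("VCL must be positive")
    return vsl / vcl


def straightness(vsl, vap):
    """STR: straight-line velocity (VSL) relative to average path velocity (VAP)."""
    if vap <= 0:
        raise ValueError("VAP must be positive")
    return vsl / vap


# Hypothetical velocities in micrometres per second, for illustration only.
example = {"VCL": 100.0, "VAP": 80.0, "VSL": 50.0}
print(linearity(example["VSL"], example["VCL"]))
print(straightness(example["VSL"], example["VAP"]))
```

Because both ratios share VSL in the numerator, a change in straight-line velocity moves VSL, LIN and STR together, which is worth keeping in mind when reading their correlations side by side.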

Changes in histidine residues 

Because of its high histidine content, the N-intracellular
region of CatSper1 is thought to be involved in the
pH-mediated regulation of the CatSper channel. Therefore,
we evaluated whether the variation in histidine content




[Scatter plot; x-axis: Relative testes mass.]

Figure 3 Relationship between relative testes mass and the
length of CatSper1 N-terminal region in rodents. Results of
statistical analyses are given in Table 2.



observed across rodent species is related to levels of
sperm competition. We calculated the proportion of
histidines in each sequence (Additional file 1) and
performed PGLS analysis with relative testes mass as
predictor; we found no significant correlation (F = 1.47,
P = 0.25) (Additional file 5A). Likewise, no significant
correlation was found between the proportion of histidines
and parameters of sperm swimming velocity (data
not shown).
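
The histidine proportion used in these analyses can be illustrated with a short sketch. It assumes that the proportion is the number of histidine residues divided by the number of residues in the sequence, with alignment gaps excluded; the text does not state how gaps were handled, and the function name `histidine_proportion` is hypothetical.

```python
def histidine_proportion(aligned_seq, gap_chars="-"):
    """Return the fraction of residues that are histidine (H), ignoring gaps."""
    residues = [aa for aa in aligned_seq.upper() if aa not in gap_chars]
    if not residues:
        raise ValueError("sequence contains no residues")
    return residues.count("H") / len(residues)


# Short made-up fragment, for illustration only.
print(histidine_proportion("HYHGDH--"))
```

Dividing by the ungapped length keeps the measure comparable between species whose N-terminal regions differ in length because of indels.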

These results suggest that sperm competition has no
influence on the histidine content of rodent CatSper1. No
significant relationship was observed between sequence
length and histidine proportion (F = 3.26, P = 0.0941).
Thus, differences in histidine content are not attributable
to residue gains and losses by insertions and deletions.
Instead, two global histidine gains were detected
throughout the phylogeny: the first occurring in the
common ancestor of murid species after the split from
rat, and a second between Mus pahari and the remaining
Mus species (Additional file 5). These observations suggest
that the proportion of histidines in CatSper1 is evolving
following a phylogenetic pattern.

Discussion 

Previous studies in primates and rodents revealed
that the rate of indel substitutions in the first exon
of Catsper1 is higher than that in neutral genomic
regions, suggesting that positive selection is promoting
length variation in CatSper1 [25,26]. Because CatSper1 is
a possible candidate for direct involvement in sperm
competition (see Background), we assessed whether the
fixation of indel substitutions in the N-terminal region of
CatSper1 is associated with this selective force. Our results
revealed a significant negative correlation between the
length of CatSper1 N-terminus and relative testes mass, a
reliable proxy of sperm competition. This finding provides
evidence that sperm competition has an influence
on structural dynamics of CatSper and favors a shortening
of the N-terminal domain of this protein. It has been
reported that variation in molecular mass of SvsII, the
major component of the rodent copulatory plug, correlates
positively with sperm competition levels across rodent
species [27]. Given that most of the variation in the molecular
mass of SvsII is presumably produced through insertions
and deletions, this was so far the only evidence
suggesting that indels may explain adaptive length divergence
in reproductive proteins under scenarios of sperm
competition. Our results now provide strong support for
the idea that sexual selection is responsible for structural
variation in some reproductive proteins.

Table 3 PGLS of CatSper1 length in relation to sperm velocity parameters

| Dependent variable | Predictor | n | d.f. | Slope | R² | F | P | λ^a | P^b (λ = 0) | P^b (λ = 1) | ES^c | CL (-)^d | CL (+)^d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VCL | Sequence length | 11 | 9 | -0.975 | 0.512 | 1.966 | 0.191 | 0.754 | 0.066 | 0.362 | 0.430 | -0.223 | 1.083 |
| VSL | Sequence length | 11 | 9 | -2.375 | 0.216 | 10.480 | 0.0089** | 0.667 | 0.186 | 0.103 | 0.898 | 0.245 | 1.551 |
| VAP | Sequence length | 11 | 9 | -2.003 | 0.164 | 8.340 | 0.016* | 0.556 | 0.329 | 0.110 | 0.818 | 0.165 | 1.472 |
| ALH | Sequence length | 11 | 9 | 0.044 | 0.201 | 2.265 | 0.159 | 0.528 | 0.943 | 0.188 | 0.483 | -0.21 | 1.176 |
| LIN | Sequence length | 11 | 9 | -0.016 | 0.785 | 32.795 | 0.003*** | 0 | 1 | 0.039 | 1.402 | 0.709 | 2.095 |
| STR | Sequence length | 11 | 9 | -0.009 | 0.626 | 15.06 | 0.001*** | 0.66 | 0.175 | 0.161 | 1.074 | 0.381 | 1.767 |
| BCF | Sequence length | 11 | 9 | 0.016 | 0.009 | 0.085 | 0.919 | 1 | 0.001 | 1 | 0.097 | -0.595 | 0.789 |
| Overall Sperm Velocity | Sequence length | 11 | 9 | -0.195 | 0.475 | 10.03 | 0.005** | 0.652 | 0.128 | 0.154 | 0.920 | 0.227 | 1.613 |

P values < 0.05 (*), < 0.01 (**) and < 0.005 (***) indicate statistical significance.
^a λ value indicates phylogenetic effect.
^b Significance of log-likelihood ratios for λ against models with λ = 0 and λ = 1 are shown. Statistically significant values are shown in bold.
^c Effect size (ES) calculated from the F-values.
^d Noncentral 95% confidence limits (CL). Confidence intervals excluding 0 indicate statistically significant relationships and are shown in bold.
VCL, Curvilinear velocity; VSL, Straight-line velocity; VAP, Average path velocity; ALH, Amplitude of lateral head displacement; LIN, Linearity; STR, Straightness; BCF, Beat cross frequency.

The mechanisms generating insertions and deletions
in the genome are not well understood. In any case, it
has long been recognized that small deletions are more
abundant than insertions of similar size in protein-coding
sequences [46]. Our results agree with this evidence,
because deletions outnumbered insertions in
CatSper1. Moreover, we observed a variable number of
indels across terminal branches, with some lineages
showing insertion:deletion ratios that deviated from the
expected ones. The most extreme cases were the species
subjected to more intense sperm competition, namely
Apodemus sylvaticus and Lemniscomys barbarus, which
showed an almost exclusive presence of deletions. It
seems, therefore, that the higher fixation of deletions in the
first exon of Catsper1 results not only from deletion
bias as an indel mechanism, but also from selective forces
such as sperm competition contributing to the shortening
of this region.

[Scatter plot; x-axis: Catsper1 N-terminus length (number of amino acids).]

Figure 4 Correlations between sequence length of the N-terminal
region of CatSper1 with overall sperm velocity. Results of statistical
analyses for all velocity components are given in Table 3.

CatSper1 is important for the regulation of intracellular
Ca2+ required for both activated and hyperactivated
sperm motility [29-31] that is needed at different stages
during transit in the female tract. Experiments using null
mice revealed that CatSper1 is essential for maintenance
of sperm motility and hyperactivation [29,30], maintenance
of intracellular ATP levels [47], progression beyond
the oviductal sperm reservoir [48] and, ultimately, male
fertility [29,49]. We thus tested whether length variation
in the N-terminal region of CatSper1 is linked to sperm
swimming velocity. Our results showed a significant
negative correlation of CatSper1 N-terminus length with
four parameters describing velocity (VSL, VAP, STR and
LIN) and an overall sperm velocity component, revealing
an association between the shortening of this region and
increases in progressive motility. Our findings thus
support a relationship between structural divergence
of CatSper1 and sperm movement among rodent species.
Previous studies have shown evidence that when sperm
competition increases so does sperm swimming speed
[14,17,50]. Swimming speed is a major determinant of
fertilization success in sperm competition contexts [10,11].
Based on the correlation observed between relative testes
mass, CatSper1 sequence length, and sperm velocity
parameters, our results support the hypothesis that sperm
competition favors shortening of the N-intracellular
domain of CatSper1, leading to an increase in sperm
swimming speed during transport of spermatozoa in the female
reproductive tract and, as a consequence, an increase in the
probability of fertilization success.

Although a high number of amino acid replacements,
driven by positive selection, were observed in the
N-terminal region of CatSper1, we did not find evidence of
an association between ω ratios and levels of sperm
competition or sperm velocity parameters. One possible
interpretation of these results is that, whereas indel
substitutions seem to be a primary target of post-copulatory
sexual selection, amino acid divergence may be influenced
by multiple selective forces promoting changes
not related to reproductive success. On the other hand,
it is possible that the different species follow different
mutation routes to increase fertilization ability, leading
to lineage heterogeneity of adaptive evolution, either
within the Catsper1 gene or across multiple genes
controlling the same traits. Therefore, an effect of sexual
selection on the molecular evolution of CatSper1 cannot
be completely discarded despite the fact that our
analyses did not reveal such an association.

The role of the N-intracellular region of CatSper1 has
not yet been clarified. Catsper1 is a constitutively active
unit of the CatSper channel, which is strongly potentiated
by intracellular alkalinization [51]. An involvement
of the CatSper1 N-terminal region in pH regulation
of the CatSper channel has been suggested because
of its high content of histidine residues [51]. Our results
showed that variation in the proportion of histidines
in the N-terminal region of rodent CatSper1 is not
associated with sperm competition levels, sperm velocity or
sequence length. Instead, we observed that different murine
clades keep a constant amount of histidines, which
suggests that successive histidine gains (or losses) have
occurred through evolutionary time. Nonetheless, the
acquisition (or loss) of histidines in CatSper1 could be adaptive,
as almost half of the amino acids under positive selection
include these residues. On the other hand, previous
reports have revealed that the sensitivity of channels to
intracellular pH is regulated by histidine residues located
in particular intracellular domains [52,53]. It is therefore
possible that pH sensitivity of CatSper1 is related to the
location of histidine residues in the N-terminal domain
rather than to their abundance.

Another suggested functional model for the intracellular
N-terminal region of CatSper1 is one based on the
ball-and-chain model of K+ channels [25]. This is plausible
because the CatSper channel seems to resemble K+
channels rather than Ca2+ channels. According to the
ball-and-chain model, the N-terminal domain of CatSper
would act to physically block the ion channel pore region,
causing the inactivation of the channel. Variations in
length of this intracellular domain could be relevant for
the activation/inactivation rate of the channel due to
spatial restrictions and, hence, may affect sperm motility
[54]. In any case, there could be multiple mechanisms
regulating Ca2+ influx through the CatSper channel, and
further molecular and biochemical approaches clarifying
the role of the N-intracellular domain will be necessary to
determine the functional consequences of indel and amino
acid substitutions in this region as well as which changes
may be advantageous in terms of sperm competition.

Our study provides the first evidence of how sperm 
competition is able to influence traits important for 
fertilization success by promoting structural changes in 
a sperm-specific protein. Sperm-specific proteins have 
evolved through large changes in protein length, with a 
larger number of indel events, in comparison with genes 
from other tissues in mammals [55]. Whenever indel 
substitutions do not imply drastic changes affecting pro- 
tein function, structural divergence may be a source of 
variation able to promote advantageous changes more 
efficiently than nucleotide replacements. Therefore, it is 
possible that indel substitutions constitute primary tar- 
gets of positive selection in some reproductive genes. 

In any case, caution should be exercised when searching
for links between genotypic and phenotypic traits.
As mentioned above, in the context of sexual selection,
species may exhibit several different mutational routes
associated with increases in reproductive fitness. Moreover,
many traits evolving under sexual selection are likely to
be regulated by multiple genes. Analyses of groups of
genes potentially associated with particular phenotypic
traits could provide insights into the evolutionary
processes of traits important for fertilization.

Our findings also help to identify a starting
point for future work investigating the influence of
evolutionary forces on structural divergence of protein-coding
sequences. It has been recently discovered that structural
variants between close orthologous sequences are rare in
the mouse and that a very low proportion of these lead to
phenotypic changes [21]. Structural divergence is much
more prevalent in duplicated genes, leading to the
generation of functionally distinct paralogs [56]. A good example
to address these issues may be CatSper itself and, thus, it
could be worthwhile expanding the present studies to other
members of the CatSper gene family.

Conclusions 

Our study has revealed that length variation in the
N-terminal region of CatSper1, resulting from an excess of
indel substitutions favored by positive selection, is
associated with relative testes mass, suggesting that sperm
competition is a primary force influencing the structural
evolution of this intracellular domain of the CatSper
channel. In addition, our results have shown that variation
in the length of CatSper1 N-terminus is associated
with changes in sperm swimming velocity, an essential
sperm function for successful fertilization. Altogether,
the amino terminal region of CatSper1 seems to be an
important target of sexual selection whose structural
changes may result in faster sperm which are more likely
to win the race to fertilization. To the best of our knowledge,
this is the first observation of protein structural
divergence linked to evolution of traits important for
sperm competition.

Availability of supporting data 

CatSper1 sequences were deposited in GenBank
(https://www.ncbi.nlm.nih.gov/genbank/) with accession numbers
KJ652954-KJ652968. Catsper1 nucleotide matrices and
resulting phylogenetic trees are available in the TreeBASE
repository (http://www.treebase.org) with accession URL:
http://purl.org/phylo/treebase/phylows/study/TB2:S15721
[57]. Supporting data are also included as additional files.